Weibull minimum distribution (weibull_min)#
The Weibull distribution (SciPy: scipy.stats.weibull_min) is a flexible continuous distribution on \([0,\infty)\) that is widely used to model lifetimes / time-to-failure.
Its most important practical feature is that its hazard rate can be decreasing, constant, or increasing depending on the shape parameter — making it a go-to model in reliability engineering and survival analysis.
Learning goals#
Write down the PDF/CDF (and survival/hazard) and connect them to intuition.
Interpret the shape and scale parameters via the hazard rate.
Derive moments (mean/variance) using Gamma-function integrals.
Sample from the distribution using a NumPy-only inverse-CDF method.
Use scipy.stats.weibull_min for evaluation, sampling, and fitting.
import numpy as np
import plotly
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
import scipy
from scipy import optimize, special
from scipy.stats import weibull_min as weibull_min_dist
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=5, suppress=True)
rng = np.random.default_rng(42)
# Record versions for reproducibility (useful when numerical details matter).
VERSIONS = {"numpy": np.__version__, "scipy": scipy.__version__, "plotly": plotly.__version__}
1) Title & Classification#
Name: weibull_min (Weibull minimum distribution; SciPy: scipy.stats.weibull_min)
Type: Continuous
Support (standard form): \(x \in [0,\infty)\)
Parameter space (standard form): shape \(c>0\)
SciPy location/scale: loc \(\in \mathbb{R}\), scale \(> 0\), with
\[X = \text{loc} + \text{scale}\,Y, \qquad Y \sim \mathrm{WeibullMin}(c).\]
Unless stated otherwise, this notebook uses the standard form (loc=0, scale=1).
Note on naming: this is the usual Weibull distribution used for positive lifetimes. SciPy also provides weibull_max, which is a reflected version used in some extreme-value contexts.
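The reflection relationship between the two SciPy distributions can be checked numerically; this is a quick sketch added here (not part of the text above):

```python
import numpy as np
from scipy.stats import weibull_min, weibull_max

c = 2.3
x = np.linspace(0.01, 5.0, 50)

# weibull_max lives on (-inf, 0] and mirrors weibull_min about 0:
# its density at -x equals the weibull_min density at x ...
assert np.allclose(weibull_max.pdf(-x, c), weibull_min.pdf(x, c))
# ... and P(X_max <= -x) = P(X_min > x).
assert np.allclose(weibull_max.cdf(-x, c), weibull_min.sf(x, c))
print("weibull_max is weibull_min reflected about 0")
```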
2) Intuition & Motivation#
What it models#
The Weibull distribution is a workhorse model for positive durations and lifetimes. A central concept in reliability is the hazard rate (instantaneous failure rate)
\[h(x) = \frac{f(x)}{S(x)} = \frac{f(x)}{1 - F(x)}.\]
For a Weibull with shape \(c\) and scale \(\lambda\) (and loc=0), the hazard is a simple power law:
\[h(x) = \frac{c}{\lambda}\left(\frac{x}{\lambda}\right)^{c-1}.\]
This gives a clean interpretation:
\(c<1\): decreasing hazard (“infant mortality” / early failures)
\(c=1\): constant hazard (memoryless exponential case)
\(c>1\): increasing hazard (“wear-out” / aging)
Typical real-world use cases#
Reliability / life testing: time-to-failure of components, fatigue life.
Survival analysis: parametric survival model when hazards are monotone.
Wind speed and hydrology: positive-valued environmental measurements.
Material strength: weakest-link arguments often motivate Weibull-like models.
Relations to other distributions#
Exponential: if \(c=1\), then \(X\sim \mathrm{Exp}(\text{scale}=\lambda)\).
Rayleigh: if \(c=2\) and \(\lambda=\sqrt{2}\,\sigma\), then \(X\) is Rayleigh(\(\sigma\)).
Gumbel (minimum) via log transform: if \(X\sim\mathrm{Weibull}(c,\lambda)\) and \(Y=\log X\), then
\[F_Y(y)=\mathbb{P}(Y\le y)=1-\exp\{-\exp(c(y-\log\lambda))\},\]
which is a Gumbel-min (left-skewed extreme value) distribution.
Generative story from an exponential: if \(T\sim\mathrm{Exp}(1)\) then
\[X = \lambda\,T^{1/c} \sim \mathrm{Weibull}(c,\lambda).\]
This directly yields an efficient sampler.
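These relations are easy to verify numerically. The sketch below (variable names are ours) checks the exponential and Rayleigh special cases against SciPy's densities and confirms the generative story with a Monte Carlo moment comparison:

```python
import numpy as np
from scipy.special import gamma
from scipy.stats import weibull_min, expon, rayleigh

x = np.linspace(0.01, 6.0, 200)

# c = 1: Weibull reduces to the exponential distribution.
lam = 2.0
assert np.allclose(weibull_min.pdf(x, 1.0, scale=lam), expon.pdf(x, scale=lam))

# c = 2 with lambda = sqrt(2)*sigma: Weibull is Rayleigh(sigma).
sigma = 1.5
assert np.allclose(
    weibull_min.pdf(x, 2.0, scale=np.sqrt(2) * sigma),
    rayleigh.pdf(x, scale=sigma),
)

# Generative story: X = lambda * T**(1/c) with T ~ Exp(1).
rng = np.random.default_rng(0)
c, lam = 1.7, 2.5
x_gen = lam * rng.exponential(size=200_000) ** (1.0 / c)
# The sample mean should be close to the exact mean lambda * Gamma(1 + 1/c).
print(x_gen.mean(), lam * gamma(1 + 1 / c))
```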
3) Formal Definition#
We use the common reliability parameterization with shape \(c>0\) and scale \(\lambda>0\) (SciPy’s scale). In the standard form, \(\lambda=1\).
Let \(z = \frac{x-\text{loc}}{\lambda}\).
PDF#
For \(x\ge \text{loc}\),
\[f(x; c, \text{loc}, \lambda) = \frac{c}{\lambda}\, z^{c-1}\, e^{-z^{c}},\]
and \(f(x)=0\) for \(x<\text{loc}\).
CDF#
For \(x\ge \text{loc}\),
\[F(x; c, \text{loc}, \lambda) = 1 - e^{-z^{c}}.\]
Equivalently, the survival function is
\[S(x) = 1 - F(x) = e^{-z^{c}}.\]
Hazard rate#
Whenever \(S(x)>0\) (i.e., for finite \(x\)),
\[h(x) = \frac{f(x)}{S(x)} = \frac{c}{\lambda}\, z^{c-1}.\]
Quantile function (PPF)#
For \(0<q<1\),
\[Q(q) = \text{loc} + \lambda\,\bigl(-\log(1-q)\bigr)^{1/c}.\]
def weibull_min_pdf(x: np.ndarray, c: float) -> np.ndarray:
"""PDF of the standard WeibullMin(c) distribution (loc=0, scale=1)."""
x = np.asarray(x, dtype=float)
if c <= 0:
return np.full_like(x, np.nan, dtype=float)
out = np.zeros_like(x, dtype=float)
mask = x >= 0
xm = x[mask]
with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
out[mask] = c * np.power(xm, c - 1.0) * np.exp(-np.power(xm, c))
return out
def weibull_min_logpdf(x: np.ndarray, c: float) -> np.ndarray:
"""Log-PDF of the standard WeibullMin(c) distribution (stable for tiny densities)."""
x = np.asarray(x, dtype=float)
if c <= 0:
return np.full_like(x, np.nan, dtype=float)
out = np.full_like(x, -np.inf, dtype=float)
# Strictly positive region.
mask_pos = x > 0
xp = x[mask_pos]
out[mask_pos] = np.log(c) + (c - 1.0) * np.log(xp) - np.power(xp, c)
# Boundary at 0.
mask_zero = x == 0
if np.any(mask_zero):
if np.isclose(c, 1.0):
out[mask_zero] = 0.0 # pdf(0)=1
elif c < 1.0:
out[mask_zero] = np.inf # pdf(0)=+inf
else:
out[mask_zero] = -np.inf # pdf(0)=0
return out
def weibull_min_cdf(x: np.ndarray, c: float) -> np.ndarray:
"""CDF of the standard WeibullMin(c) distribution."""
x = np.asarray(x, dtype=float)
if c <= 0:
return np.full_like(x, np.nan, dtype=float)
out = np.zeros_like(x, dtype=float)
mask = x >= 0
xm = x[mask]
# Use expm1 for accuracy near x=0: 1 - exp(-t) = -expm1(-t).
t = np.power(xm, c)
out[mask] = -np.expm1(-t)
return out
def weibull_min_sf(x: np.ndarray, c: float) -> np.ndarray:
"""Survival function S(x)=P(X>x) for the standard WeibullMin(c)."""
x = np.asarray(x, dtype=float)
if c <= 0:
return np.full_like(x, np.nan, dtype=float)
out = np.ones_like(x, dtype=float)
mask = x >= 0
xm = x[mask]
out[mask] = np.exp(-np.power(xm, c))
return out
def weibull_min_hazard(x: np.ndarray, c: float) -> np.ndarray:
"""Hazard rate h(x)=f(x)/S(x) for the standard WeibullMin(c)."""
x = np.asarray(x, dtype=float)
if c <= 0:
return np.full_like(x, np.nan, dtype=float)
out = np.zeros_like(x, dtype=float)
mask = x >= 0
xm = x[mask]
with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
out[mask] = c * np.power(xm, c - 1.0)
return out
def weibull_min_ppf(q: np.ndarray, c: float) -> np.ndarray:
"""Quantile function (inverse CDF) of the standard WeibullMin(c)."""
q = np.asarray(q, dtype=float)
if c <= 0:
return np.full_like(q, np.nan, dtype=float)
out = np.full_like(q, np.nan, dtype=float)
mask = (q >= 0) & (q <= 1)
qm = q[mask]
out[mask] = np.power(-np.log1p(-qm), 1.0 / c)
out[q == 0] = 0.0
out[q == 1] = np.inf
return out
def weibull_min_pdf_loc_scale(x: np.ndarray, c: float, *, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""PDF with SciPy-style loc/scale using the standard-form implementation."""
x = np.asarray(x, dtype=float)
if scale <= 0:
return np.full_like(x, np.nan, dtype=float)
z = (x - loc) / scale
out = np.zeros_like(x, dtype=float)
mask = z >= 0
out[mask] = weibull_min_pdf(z[mask], c) / scale
return out
def weibull_min_cdf_loc_scale(x: np.ndarray, c: float, *, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""CDF with SciPy-style loc/scale using the standard-form implementation."""
x = np.asarray(x, dtype=float)
if scale <= 0:
return np.full_like(x, np.nan, dtype=float)
z = (x - loc) / scale
out = np.zeros_like(x, dtype=float)
mask = z >= 0
out[mask] = weibull_min_cdf(z[mask], c)
return out
def weibull_min_ppf_loc_scale(q: np.ndarray, c: float, *, loc: float = 0.0, scale: float = 1.0) -> np.ndarray:
"""PPF with SciPy-style loc/scale using the standard-form implementation."""
if scale <= 0:
q = np.asarray(q, dtype=float)
return np.full_like(q, np.nan, dtype=float)
return loc + scale * weibull_min_ppf(q, c)
# Sanity check: our formulas match SciPy (standard form and loc/scale).
c = 1.7
x = np.logspace(-4, 2, 25)
q = np.linspace(0.01, 0.99, 9)
rv_std = weibull_min_dist(c)
assert np.allclose(weibull_min_pdf(x, c), rv_std.pdf(x))
assert np.allclose(weibull_min_cdf(x, c), rv_std.cdf(x))
assert np.allclose(weibull_min_ppf(q, c), rv_std.ppf(q))
# loc/scale
loc, scale = -0.3, 2.5
rv_ls = weibull_min_dist(c, loc=loc, scale=scale)
assert np.allclose(weibull_min_pdf_loc_scale(x, c, loc=loc, scale=scale), rv_ls.pdf(x))
assert np.allclose(weibull_min_cdf_loc_scale(x, c, loc=loc, scale=scale), rv_ls.cdf(x))
assert np.allclose(weibull_min_ppf_loc_scale(q, c, loc=loc, scale=scale), rv_ls.ppf(q))
# hazard matches f/S in standard form (avoid survival underflow in the far tail)
x_haz = np.logspace(-4, 1, 30)
haz = weibull_min_pdf(x_haz, c) / weibull_min_sf(x_haz, c)
assert np.allclose(haz, weibull_min_hazard(x_haz, c))
4) Moments & Properties#
A convenient property of the Weibull distribution is that all positive moments exist and have a clean Gamma-function form.
Raw moments#
If \(X\sim\mathrm{Weibull}(c,\lambda)\) with loc=0, then for any \(r>-c\),
\[\mathbb{E}[X^r] = \lambda^r\,\Gamma\!\left(1+\frac{r}{c}\right),\]
where \(\Gamma(\cdot)\) is the Gamma function.
Mean and variance#
Let \(g_k = \Gamma(1+k/c)\). Then
\[\mathbb{E}[X] = \lambda\,g_1, \qquad \mathrm{Var}(X) = \lambda^2\left(g_2 - g_1^2\right).\]
Skewness and kurtosis#
Using raw moments and central-moment identities, the third and fourth central moments are
\[\mu_3 = \lambda^3\left(g_3 - 3 g_1 g_2 + 2 g_1^3\right), \qquad \mu_4 = \lambda^4\left(g_4 - 4 g_1 g_3 + 6 g_1^2 g_2 - 3 g_1^4\right).\]
Skewness and excess kurtosis are
\[\gamma_1 = \frac{\mu_3}{\mathrm{Var}(X)^{3/2}}, \qquad \gamma_2 = \frac{\mu_4}{\mathrm{Var}(X)^{2}} - 3.\]
MGF / characteristic function#
There is no simple elementary closed form for the moment generating function
\[M_X(t) = \mathbb{E}\left[e^{tX}\right],\]
but it can be written as a power series using the moments:
\[M_X(t) = \sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!}\,\Gamma\!\left(1+\frac{n}{c}\right),\]
with a radius of convergence that depends on \(c\):
\(c>1\): \(M_X(t)\) exists for all real \(t\) (the tail is lighter than exponential).
\(c=1\): \(M_X(t)\) exists for \(t < 1/\lambda\) (exponential case).
\(0<c<1\): \(M_X(t)\) diverges for every \(t>0\) (tail is heavier than exponential).
The characteristic function \(\varphi_X(t)=\mathbb{E}[e^{itX}]\) exists for all real \(t\) (bounded integrand).
Entropy#
The differential entropy (for loc=0) has a simple closed form:
\[H(X) = \gamma\left(1 - \frac{1}{c}\right) + \log\frac{\lambda}{c} + 1,\]
where \(\gamma\approx 0.57721\) is the Euler–Mascheroni constant.
EULER_GAMMA = float(-special.digamma(1.0)) # Euler–Mascheroni constant γ
def weibull_min_raw_moment(r: float, c: float, *, scale: float = 1.0) -> float:
"""E[X^r] for Weibull(c, scale) with loc=0."""
if c <= 0 or scale <= 0:
return float("nan")
return float((scale**r) * special.gamma(1.0 + r / c))
def weibull_min_mean(c: float, *, scale: float = 1.0, loc: float = 0.0) -> float:
if c <= 0 or scale <= 0:
return float("nan")
return float(loc + scale * special.gamma(1.0 + 1.0 / c))
def weibull_min_variance(c: float, *, scale: float = 1.0) -> float:
if c <= 0 or scale <= 0:
return float("nan")
g1 = special.gamma(1.0 + 1.0 / c)
g2 = special.gamma(1.0 + 2.0 / c)
return float((scale**2) * (g2 - g1**2))
def weibull_min_skewness(c: float, *, scale: float = 1.0) -> float:
if c <= 0 or scale <= 0:
return float("nan")
g1 = special.gamma(1.0 + 1.0 / c)
g2 = special.gamma(1.0 + 2.0 / c)
g3 = special.gamma(1.0 + 3.0 / c)
mu3 = (scale**3) * (g3 - 3.0 * g1 * g2 + 2.0 * g1**3)
var = (scale**2) * (g2 - g1**2)
return float(mu3 / (var ** 1.5))
def weibull_min_excess_kurtosis(c: float, *, scale: float = 1.0) -> float:
if c <= 0 or scale <= 0:
return float("nan")
g1 = special.gamma(1.0 + 1.0 / c)
g2 = special.gamma(1.0 + 2.0 / c)
g3 = special.gamma(1.0 + 3.0 / c)
g4 = special.gamma(1.0 + 4.0 / c)
mu4 = (scale**4) * (g4 - 4.0 * g1 * g3 + 6.0 * (g1**2) * g2 - 3.0 * g1**4)
var = (scale**2) * (g2 - g1**2)
return float(mu4 / (var**2) - 3.0)
def weibull_min_entropy(c: float, *, scale: float = 1.0) -> float:
if c <= 0 or scale <= 0:
return float("nan")
return float(1.0 + np.log(scale / c) + EULER_GAMMA * (1.0 - 1.0 / c))
# Compare to SciPy.
c = 1.5
scale = 2.0
rv = weibull_min_dist(c, scale=scale)
mean_sp, var_sp, skew_sp, kurt_sp = rv.stats(moments="mvsk")
entropy_sp = rv.entropy()
mean_f = weibull_min_mean(c, scale=scale)
var_f = weibull_min_variance(c, scale=scale)
skew_f = weibull_min_skewness(c, scale=scale)
kurt_f = weibull_min_excess_kurtosis(c, scale=scale)
entropy_f = weibull_min_entropy(c, scale=scale)
print("SciPy vs formulas (c=1.5, scale=2.0):")
print(" mean ", float(mean_sp), "|", mean_f)
print(" var ", float(var_sp), "|", var_f)
print(" skew ", float(skew_sp), "|", skew_f)
print(" kurt ", float(kurt_sp), "|", kurt_f)
print(" entropy", float(entropy_sp), "|", entropy_f)
assert np.allclose([mean_sp, var_sp, skew_sp, kurt_sp, entropy_sp], [mean_f, var_f, skew_f, kurt_f, entropy_f])
SciPy vs formulas (c=1.5, scale=2.0):
mean 1.805490585901867 | 1.805490585901867
var 1.5027611392557279 | 1.5027611392557279
skew 1.0719865728909583 | 1.0719865728909592
kurt 1.3904035615957744 | 1.3904035615957788
entropy 1.4800872940856251 | 1.4800872940856251
5) Parameter Interpretation#
SciPy’s weibull_min uses:
c as the shape parameter (often written \(k\) or \(\beta\))
scale as the scale parameter (often written \(\lambda\) or \(\eta\))
loc as a location shift
Shape c#
Controls the hazard rate behavior:
\[h(x)=\frac{c}{\lambda}\left(\frac{x}{\lambda}\right)^{c-1}.\]
Also controls the shape near zero:
if \(c<1\), the density blows up at \(x=0\) (many very small values)
if \(c=1\), the density is finite at \(x=0\) (exponential)
if \(c>1\), the density is 0 at \(x=0\) and has a mode at \(x>0\)
Scale (scale = \(\lambda\))#
Stretches the distribution horizontally: if \(Y\sim\mathrm{Weibull}(c,1)\) then \(X=\lambda Y\sim\mathrm{Weibull}(c,\lambda)\).
For lifetimes, \(\lambda\) is a characteristic life: \(F(\lambda)=1-e^{-1}\approx 0.632\).
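The characteristic-life property is a one-liner to confirm (a quick check added here): the CDF evaluated at the scale parameter is \(1-e^{-1}\) for every shape.

```python
import numpy as np
from scipy.stats import weibull_min

# At x = scale, F(x) = 1 - e^{-1} ≈ 0.632, regardless of the shape c.
for c in [0.5, 1.0, 2.0, 5.0]:
    for scale in [0.5, 2.0, 10.0]:
        assert np.isclose(weibull_min.cdf(scale, c, scale=scale), 1 - np.exp(-1))
print("characteristic life check passed:", 1 - np.exp(-1))
```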
Location loc#
Shifts support: support becomes \(x\ge \text{loc}\).
Useful for modeling a minimum lifetime (or a measurement offset).
# Shape effects: PDF and hazard for different c (standard form).
x_pdf = np.linspace(1e-4, 4.0, 800)
x_haz = np.logspace(-3, 1, 600)
c_values = [0.5, 1.0, 1.5, 3.0]
fig = go.Figure()
for c in c_values:
y = weibull_min_pdf(x_pdf, c)
fig.add_trace(go.Scatter(x=x_pdf, y=y, mode="lines", name=f"c={c}"))
fig.update_layout(
title="Weibull_min PDF for different shapes (scale=1)",
xaxis_title="x",
yaxis_title="f(x; c)",
)
fig.show()
fig = go.Figure()
for c in c_values:
y = weibull_min_hazard(x_haz, c)
fig.add_trace(go.Scatter(x=x_haz, y=y, mode="lines", name=f"c={c}"))
fig.update_layout(
title="Hazard rate h(x)=c x^{c-1} for different shapes (scale=1)",
xaxis_title="x",
yaxis_title="h(x)",
)
fig.update_xaxes(type="log")
fig.show()
# Scale effects at fixed c.
c = 2.0
scales = [0.5, 1.0, 2.0]
x = np.linspace(0, 6, 900)
fig = go.Figure()
for scale in scales:
y = weibull_min_pdf_loc_scale(x, c, loc=0.0, scale=scale)
fig.add_trace(go.Scatter(x=x, y=y, mode="lines", name=f"scale={scale}"))
fig.update_layout(
title="Scale stretches the distribution (fixed c=2)",
xaxis_title="x",
yaxis_title="f(x; c, scale)",
)
fig.show()
# CDF shapes (standard form).
x = np.linspace(0, 6.0, 800)
c_values = [0.5, 1.0, 1.5, 3.0]
fig = go.Figure()
for c in c_values:
fig.add_trace(go.Scatter(x=x, y=weibull_min_cdf(x, c), mode="lines", name=f"c={c}"))
fig.update_layout(
title="Weibull_min CDF for different shapes (scale=1)",
xaxis_title="x",
yaxis_title="F(x; c)",
)
fig.show()
6) Derivations#
We derive moments and the likelihood in the loc=0 case. (Location just shifts \(X\) and does not change variance or shape.)
6.1 Expectation and general moments#
Start from the PDF with loc=0:
\[f(x) = \frac{c}{\lambda}\left(\frac{x}{\lambda}\right)^{c-1} e^{-(x/\lambda)^c}, \qquad x \ge 0.\]
For \(r>-c\),
\[\mathbb{E}[X^r] = \int_0^\infty x^r\,\frac{c}{\lambda}\left(\frac{x}{\lambda}\right)^{c-1} e^{-(x/\lambda)^c}\,dx.\]
Use the substitution \(u=(x/\lambda)^c \Rightarrow x=\lambda u^{1/c}\) and \(dx = \frac{\lambda}{c}\,u^{1/c - 1}\,du\).
After cancellation, the integral becomes
\[\mathbb{E}[X^r] = \lambda^r \int_0^\infty u^{r/c}\,e^{-u}\,du = \lambda^r\,\Gamma\!\left(1+\frac{r}{c}\right).\]
Setting \(r=1\) and \(r=2\) yields mean and variance.
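The result of this substitution can be sanity-checked numerically (a check we add here, using scipy.integrate.quad against the Gamma-function formula):

```python
import numpy as np
from scipy import integrate, special
from scipy.stats import weibull_min

c, lam = 1.7, 2.5
for r in [0.5, 1.0, 2.0, 3.0]:
    # Numerically integrate x^r f(x) over [0, inf).
    numeric, _ = integrate.quad(
        lambda x: x**r * weibull_min.pdf(x, c, scale=lam), 0, np.inf
    )
    exact = lam**r * special.gamma(1 + r / c)
    print(r, numeric, exact)
    assert np.isclose(numeric, exact, rtol=1e-6)
```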
6.2 Variance#
Using \(\mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2\) and the moment formula:
\[\mathrm{Var}(X) = \lambda^2\left[\Gamma\!\left(1+\frac{2}{c}\right) - \Gamma\!\left(1+\frac{1}{c}\right)^{2}\right].\]
6.3 Likelihood (i.i.d. sample)#
Let \(x_1,\dots,x_n\) be i.i.d. from a Weibull with parameters \((c,\lambda)\) and loc=0.
The likelihood is
\[L(c,\lambda) = \prod_{i=1}^n \frac{c}{\lambda}\left(\frac{x_i}{\lambda}\right)^{c-1} e^{-(x_i/\lambda)^c}.\]
The log-likelihood simplifies to
\[\ell(c,\lambda) = n\log c + (c-1)\sum_{i=1}^n \log x_i - n c \log\lambda - \sum_{i=1}^n \left(\frac{x_i}{\lambda}\right)^{c}.\]
A useful fact: for fixed \(c\), the MLE of \(\lambda\) has a closed form:
\[\hat\lambda(c) = \left(\frac{1}{n}\sum_{i=1}^n x_i^c\right)^{1/c}.\]
Plugging \(\hat\lambda(c)\) back into \(\ell\) yields a profile likelihood in \(c\); the resulting score equation has to be solved numerically.
def weibull_min_loglik(c: float, scale: float, x: np.ndarray) -> float:
x = np.asarray(x, dtype=float)
if c <= 0 or scale <= 0 or np.any(x <= 0):
return -np.inf
n = x.size
return float(n * np.log(c) + (c - 1.0) * np.sum(np.log(x)) - n * c * np.log(scale) - np.sum((x / scale) ** c))
def weibull_min_scale_hat(c: float, x: np.ndarray) -> float:
x = np.asarray(x, dtype=float)
if c <= 0 or np.any(x <= 0):
return float("nan")
return float(np.mean(x**c) ** (1.0 / c))
def weibull_min_shape_score_profile(c: float, x: np.ndarray) -> float:
"""Score equation in c after profiling out scale (loc=0).
Root of this function gives the MLE for c.
"""
x = np.asarray(x, dtype=float)
if c <= 0 or np.any(x <= 0):
return float("nan")
logx = np.log(x)
x_c = x**c
return float(1.0 / c + np.mean(logx) - np.sum(x_c * logx) / np.sum(x_c))
# MLE demo in the standard loc=0 form.
c_true = 1.7
scale_true = 2.5
x = weibull_min_dist(c_true, scale=scale_true).rvs(size=4_000, random_state=rng)
# Find a bracket where the profile score changes sign.
c_grid = np.linspace(0.15, 8.0, 400)
score_vals = np.array([weibull_min_shape_score_profile(c, x) for c in c_grid])
idx = np.where(np.sign(score_vals[:-1]) * np.sign(score_vals[1:]) < 0)[0]
if idx.size == 0:
raise RuntimeError("Could not bracket the MLE root for c; try a wider grid.")
c_lo, c_hi = float(c_grid[idx[0]]), float(c_grid[idx[0] + 1])
sol = optimize.root_scalar(weibull_min_shape_score_profile, bracket=(c_lo, c_hi), args=(x,), method="brentq")
c_hat = float(sol.root)
scale_hat = weibull_min_scale_hat(c_hat, x)
print("True (c, scale):", (c_true, scale_true))
print("MLE (c, scale):", (c_hat, scale_hat))
# Compare to SciPy's fit (fix loc=0).
c_hat_sp, loc_hat_sp, scale_hat_sp = weibull_min_dist.fit(x, floc=0)
print("SciPy fit (floc=0):", (float(c_hat_sp), float(scale_hat_sp)))
# Profile log-likelihood over c (scale profiled out).
c_grid = np.linspace(0.3, 5.0, 250)
ll_prof = np.array([weibull_min_loglik(c, weibull_min_scale_hat(c, x), x) for c in c_grid])
fig = go.Figure(go.Scatter(x=c_grid, y=ll_prof, mode="lines", name="profile loglik"))
fig.add_vline(x=c_true, line_dash="dash", line_color="green", annotation_text="true c")
fig.add_vline(x=c_hat, line_dash="dash", line_color="red", annotation_text="MLE c")
fig.update_layout(title="Profile log-likelihood for c (loc=0)", xaxis_title="c", yaxis_title="log-likelihood")
fig.show()
True (c, scale): (1.7, 2.5)
MLE (c, scale): (1.681490527901736, 2.4991075083424734)
SciPy fit (floc=0): (1.6814619463263676, 2.499116178166809)
7) Sampling & Simulation#
NumPy-only algorithm (inverse transform)#
From the CDF in the loc=0 case:
\[F(x) = 1 - \exp\{-(x/\lambda)^c\}.\]
Let \(U\sim\mathrm{Uniform}(0,1)\). Setting \(U=F(X)\) and solving for \(X\) gives
\[X = \lambda\,\bigl(-\log(1-U)\bigr)^{1/c}.\]
This is an exact sampler (no rejection needed) and is typically the fastest way to generate Weibull samples.
def weibull_min_rvs_numpy(
c: float,
*,
loc: float = 0.0,
scale: float = 1.0,
size: int | tuple[int, ...] = 1,
rng: np.random.Generator | None = None,
) -> np.ndarray:
"""Sample from weibull_min using NumPy only (inverse-CDF sampler)."""
if rng is None:
rng = np.random.default_rng()
if c <= 0 or scale <= 0:
raise ValueError("Require c>0 and scale>0")
u = rng.random(size=size)
# -log1p(-u) is stable when u is very close to 1.
x = scale * np.power(-np.log1p(-u), 1.0 / c)
return loc + x
8) Visualization#
We’ll compare:
the theoretical PDF and CDF
Monte Carlo samples (NumPy-only sampler)
SciPy’s implementation
c = 1.3
scale = 2.0
n = 80_000
x_np = weibull_min_rvs_numpy(c, scale=scale, size=n, rng=rng)
x_sp = weibull_min_dist(c, scale=scale).rvs(size=n, random_state=rng)
# Histogram vs theoretical PDF
x_grid = np.linspace(0, np.quantile(x_np, 0.995), 500)
pdf_grid = weibull_min_pdf_loc_scale(x_grid, c, loc=0.0, scale=scale)
fig = px.histogram(
x=x_np,
nbins=140,
histnorm="probability density",
title="Monte Carlo histogram (NumPy-only) vs theoretical PDF",
labels={"x": "x"},
)
fig.add_trace(go.Scatter(x=x_grid, y=pdf_grid, mode="lines", name="theoretical PDF"))
fig.show()
# Empirical CDF vs theoretical CDF
x_sorted = np.sort(x_np)
ecdf = np.arange(1, n + 1) / n
cdf_grid = weibull_min_cdf_loc_scale(x_grid, c, loc=0.0, scale=scale)
fig = go.Figure()
fig.add_trace(go.Scatter(x=x_sorted, y=ecdf, mode="lines", name="empirical CDF (NumPy-only)"))
fig.add_trace(go.Scatter(x=x_grid, y=cdf_grid, mode="lines", name="theoretical CDF"))
fig.update_layout(title="CDF: empirical vs theoretical", xaxis_title="x", yaxis_title="F(x)")
fig.show()
# Quick check: NumPy-only samples and SciPy samples should look similar.
from scipy.stats import ks_2samp
ks = ks_2samp(x_np, x_sp)
print("KS two-sample test (NumPy vs SciPy samples):")
print(ks)
KS two-sample test (NumPy vs SciPy samples):
KstestResult(statistic=0.004012499999999974, pvalue=0.5387311821802074, statistic_location=1.492943742605249, statistic_sign=-1)
9) SciPy Integration#
scipy.stats.weibull_min provides the standard distribution API:
weibull_min.pdf(x, c, loc=0, scale=1)
weibull_min.cdf(x, c, loc=0, scale=1)
weibull_min.rvs(c, loc=0, scale=1, size=..., random_state=...)
weibull_min.fit(data, ...) (MLE)
A common workflow is to freeze the distribution: rv = weibull_min(c, loc=..., scale=...), then call rv.pdf, rv.cdf, rv.rvs, etc.
c = 1.8
loc = 0.0
scale = 3.0
rv = weibull_min_dist(c, loc=loc, scale=scale)
x = np.array([0.2, 1.0, 3.0, 8.0])
print("pdf:", rv.pdf(x))
print("cdf:", rv.cdf(x))
print("sf :", rv.sf(x))
samples = rv.rvs(size=5, random_state=rng)
print("rvs:", samples)
# Fitting: estimate (c, scale) with loc fixed to 0.
true_c, true_scale = 1.4, 2.2
data = weibull_min_dist(true_c, scale=true_scale).rvs(size=5_000, random_state=rng)
c_hat, loc_hat, scale_hat = weibull_min_dist.fit(data, floc=0)
print("\nFit (fixed loc=0):")
print(" true (c, scale):", (true_c, true_scale))
print(" est (c, scale):", (float(c_hat), float(scale_hat)))
pdf: [0.06823 0.21694 0.22073 0.00381]
cdf: [0.00761 0.12926 0.63212 0.9971 ]
sf : [0.99239 0.87074 0.36788 0.0029 ]
rvs: [2.36998 3.91892 3.90068 4.062 1.49376]
Fit (fixed loc=0):
true (c, scale): (1.4, 2.2)
est (c, scale): (1.3901687026330698, 2.2085788623684626)
10) Statistical Use Cases#
Hypothesis testing (goodness-of-fit)#
If the parameters are specified in advance (not fit from the same sample), you can test whether data plausibly comes from a Weibull distribution using a goodness-of-fit test such as Kolmogorov–Smirnov (KS).
Caveat: if you estimate parameters from the data and then run KS on the same data, the usual KS p-values are no longer exact (use a parametric bootstrap or a corrected procedure).
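A minimal sketch of the parametric bootstrap mentioned in the caveat (our own illustrative procedure; the bootstrap count B is arbitrary): fit, compute the KS statistic, then recalibrate its null distribution by refitting on data simulated from the fitted model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = stats.weibull_min(1.4, scale=2.2).rvs(size=500, random_state=rng)

# Fit (loc fixed at 0) and compute the observed KS statistic.
c_hat, _, scale_hat = stats.weibull_min.fit(data, floc=0)
d_obs = stats.kstest(data, stats.weibull_min(c_hat, scale=scale_hat).cdf).statistic

# Parametric bootstrap: simulate from the fitted model, refit, recompute D.
B = 200  # modest for illustration; use more in practice
d_boot = np.empty(B)
for b in range(B):
    sim = stats.weibull_min(c_hat, scale=scale_hat).rvs(size=data.size, random_state=rng)
    c_b, _, s_b = stats.weibull_min.fit(sim, floc=0)
    d_boot[b] = stats.kstest(sim, stats.weibull_min(c_b, scale=s_b).cdf).statistic

# Calibrated p-value: fraction of refitted-null statistics exceeding the observed one.
p_boot = float(np.mean(d_boot >= d_obs))
print("observed D:", d_obs, "bootstrap p-value:", p_boot)
```

Unlike the plain KS p-value, this calibration accounts for the optimism introduced by fitting \((c,\lambda)\) on the same data being tested.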
Bayesian modeling#
There is no simple conjugate prior for \((c,\lambda)\) jointly, but there is a convenient conjugate update when shape \(c\) is known.
If \(X\sim\mathrm{Weibull}(c,\lambda)\) with loc=0, then \(Y=X^c\) has
\[\mathbb{P}(Y > y) = \mathbb{P}\bigl(X > y^{1/c}\bigr) = e^{-y/\lambda^c},\]
so \(Y\sim\mathrm{Exp}(\text{rate}=\beta)\) with \(\beta = 1/\lambda^c\).
A Gamma prior on the rate \(\beta\) is conjugate.
Generative modeling#
Weibull distributions are commonly used as generative models for survival times and as components of mixture models (e.g., to model early-failure and wear-out subpopulations).
# Hypothesis testing example: KS test when parameters are known.
from scipy.stats import kstest
c = 1.2
scale = 2.0
x = weibull_min_dist(c, scale=scale).rvs(size=2_000, random_state=rng)
D, p_value = kstest(x, weibull_min_dist(c, scale=scale).cdf)
print("KS test against Weibull(c=1.2, scale=2.0):")
print(" D =", D)
print(" p-value=", p_value)
KS test against Weibull(c=1.2, scale=2.0):
D = 0.03232217170754775
p-value= 0.029962541380471608
# Bayesian modeling (conjugate update) when shape c is known.
# Model: X ~ Weibull(c, scale=lambda), loc=0.
# Transform: Y = X^c ~ Exp(rate=beta) with beta = 1 / lambda^c.
# Prior: beta ~ Gamma(alpha0, rate=r0).
from scipy.stats import gamma as gamma_dist
rng_local = np.random.default_rng(123)
c_known = 1.6
lambda_true = 2.5
x = weibull_min_dist(c_known, scale=lambda_true).rvs(size=800, random_state=rng_local)
y = x**c_known
# Conjugate Gamma prior on beta (rate parameterization).
alpha0 = 2.0
r0 = 1.0
alpha_post = alpha0 + y.size
r_post = r0 + np.sum(y)
# Draw posterior samples for beta, then transform to lambda.
beta_samps = gamma_dist(a=alpha_post, scale=1.0 / r_post).rvs(size=50_000, random_state=rng_local)
lambda_samps = (1.0 / beta_samps) ** (1.0 / c_known)
ci = np.quantile(lambda_samps, [0.05, 0.5, 0.95])
print("True lambda:", lambda_true)
print("Posterior lambda 90% CI + median:", ci)
fig = px.histogram(
lambda_samps,
nbins=120,
histnorm="probability density",
title="Posterior over scale (lambda) with known shape c",
labels={"value": "lambda"},
)
fig.add_vline(x=lambda_true, line_dash="dash", line_color="green", annotation_text="true")
fig.show()
True lambda: 2.5
Posterior lambda 90% CI + median: [2.39838 2.48644 2.57815]
# Generative modeling example: early-failure vs wear-out mixture.
n = 60_000
# Early failures: c<1 (decreasing hazard), shorter characteristic life.
x_early = weibull_min_rvs_numpy(0.7, scale=0.8, size=n // 2, rng=rng)
# Wear-out: c>1 (increasing hazard), longer characteristic life.
x_wear = weibull_min_rvs_numpy(3.0, scale=2.0, size=n // 2, rng=rng)
x_mix = np.concatenate([x_early, x_wear])
fig = px.histogram(
x_mix,
nbins=160,
histnorm="probability density",
title="Mixture of Weibulls (early-failure + wear-out)",
labels={"value": "time"},
)
fig.show()
print("Mixture summaries:")
print(" mean =", float(x_mix.mean()))
print(" median =", float(np.median(x_mix)))
print(" 90% =", float(np.quantile(x_mix, 0.9)))
Mixture summaries:
mean = 1.4000915006124321
median = 1.3065326608485597
90% = 2.6408964289216006
11) Pitfalls#
Invalid parameters: require c > 0 and scale > 0. With loc, the support becomes \(x\ge \text{loc}\).
Boundary behavior at 0:
if \(c<1\), the PDF and hazard blow up at 0 (this is a feature of the model, not a bug);
plots that include exactly \(x=0\) can show infinities; start from a small \(\varepsilon>0\).
Numerical stability:
use logpdf when multiplying many densities or when probabilities are tiny;
for \(x\) near 0, compute the CDF via -expm1(-t) (as done above);
for \(q\) near 1, use -log1p(-q) in the PPF (as done above).
Fitting with loc:
allowing loc to vary can lead to unstable fits or unintuitive parameter estimates;
in reliability, it is common to fix loc=0 unless a physical minimum lifetime is justified.
Model misspecification:
Weibull enforces a monotone hazard; if the true hazard is bathtub-shaped (decrease then increase), consider mixtures or more flexible survival models.
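The expm1 and logpdf advice can be demonstrated directly (a small added example; the input values are chosen to trigger the failure modes):

```python
import numpy as np
from scipy.stats import weibull_min

# CDF near 0: the naive 1 - exp(-t) rounds to 0 for tiny t,
# while -expm1(-t) keeps full relative accuracy.
t = 1e-18
naive = 1.0 - np.exp(-t)   # 0.0 (catastrophic cancellation)
stable = -np.expm1(-t)     # ~1e-18
print(naive, stable)

# Log-densities: a product of many tail densities underflows to 0,
# while the sum of log-densities stays finite.
x = np.full(500, 8.0)                        # deep in the tail for c=2
prod = np.prod(weibull_min.pdf(x, 2.0))      # underflows to 0.0
logsum = np.sum(weibull_min.logpdf(x, 2.0))  # finite
print(prod, logsum)
```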
12) Summary#
weibull_min is a continuous distribution on \([0,\infty)\) parameterized by shape \(c>0\) and (optionally) loc and scale.
Its CDF has the simple form \(F(x)=1-\exp\{-(x/\lambda)^c\}\), leading to an exact inverse-CDF sampler.
The hazard rate is \(h(x)=(c/\lambda)(x/\lambda)^{c-1}\): decreasing for \(c<1\), constant for \(c=1\), increasing for \(c>1\).
Moments are expressed with the Gamma function: \(\mathbb{E}[X^r]=\lambda^r\Gamma(1+r/c)\).
scipy.stats.weibull_min provides robust numerics for PDF/CDF/SF/PPF, sampling, and MLE fitting.
References#
Johnson, Kotz, and Balakrishnan. Continuous Univariate Distributions, Volume 1 (2nd ed.), Wiley, 1994.
Nelson. Applied Life Data Analysis, Wiley, 1982.
SciPy documentation:
scipy.stats.weibull_min.